Probabilistic XML: Models and Complexity

Authors

  • Benny Kimelfeld
  • Pierre Senellart
Abstract

Uncertainty in data naturally arises in various applications, such as data integration and Web information extraction. Probabilistic XML is one of the concepts that have been proposed to model and manage various kinds of uncertain data. In essence, a probabilistic XML document is a compact representation of a probability distribution over ordinary XML documents. Various models of probabilistic XML provide different languages, with various degrees of expressiveness, for such compact representations. Beyond representation, probabilistic XML systems are expected to support data management in a way that properly reflects the uncertainty. For instance, query evaluation entails probabilistic inference, and update operations need to properly change the entire probability space. Efficiently and effectively accomplishing data-management tasks in that manner is a major technical challenge. This chapter reviews the literature on probabilistic XML. Specifically, this chapter discusses the probabilistic XML models that have been proposed, and the complexity of query evaluation therein. Also discussed are other data-management tasks like updates and compression, as well as systemic and implementation aspects.
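
To make the notion of a "compact representation of a probability distribution over ordinary XML documents" concrete, here is a minimal sketch. It assumes one particular kind of distributional node, often called `ind` in the probabilistic XML literature, where each child is kept independently with a given probability. The abstract does not commit to this model, and the names `Node`, `Ind`, `worlds`, and `distribution` are hypothetical helpers introduced only for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from itertools import product


@dataclass
class Node:
    """Ordinary XML element (hypothetical helper type)."""
    label: str
    children: list = field(default_factory=list)


@dataclass
class Ind:
    """Assumed distributional node: each (probability, child) pair is kept independently."""
    choices: list


def _combine(option_lists):
    """Cartesian product of per-child options: concatenate forests, multiply probabilities."""
    results = []
    for combo in product(*option_lists):
        forest, prob = [], 1.0
        for sub_forest, p in combo:
            forest.extend(sub_forest)
            prob *= p
        results.append((forest, prob))
    return results


def worlds(node):
    """Return a list of (forest, probability); a forest is a list of XML strings."""
    if isinstance(node, Ind):
        option_lists = []
        for p, child in node.choices:
            options = [([], 1.0 - p)]                          # child dropped
            options += [(f, p * q) for f, q in worlds(child)]  # child kept
            option_lists.append(options)
        return _combine(option_lists)
    # Ordinary node: always present, wraps whatever its children produce.
    inner = _combine([worlds(c) for c in node.children])
    return [([f"<{node.label}>{''.join(f)}</{node.label}>"], p) for f, p in inner]


def distribution(root):
    """Merge identical documents and sum their probabilities."""
    dist = defaultdict(float)
    for forest, p in worlds(root):
        dist[forest[0]] += p
    return dict(dist)


# A person who always has a name, an email with probability 0.8,
# and (independently) a phone with probability 0.5.
doc = Node("person", [
    Node("name"),
    Ind([(0.8, Node("email")), (0.5, Node("phone"))]),
])
dist = distribution(doc)
for xml, p in sorted(dist.items()):
    print(f"{p:.2f}  {xml}")
```

The single tree `doc` stands for four ordinary documents whose probabilities sum to 1. Enumerating worlds this way is exponential in the number of choices, which is why this brute-force approach is only an illustration and not a practical query-evaluation strategy.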

Similar resources

On the Connections between Relational and XML Probabilistic Data Models

A number of uncertain data models have been proposed, based on the notion of compact representations of probability distributions over possible worlds. In probabilistic relational models, tuples are annotated with probabilities or formulae over Boolean random variables. In probabilistic XML models, XML trees are augmented with nodes that specify probability distributions over their children. Bo...

Matching Twigs in Probabilistic XML

Evaluation of twig queries over probabilistic XML is investigated. Projection is allowed and, in particular, a query may be Boolean. It is shown that for a well-known model of probabilistic XML, the evaluation of twigs with projection is tractable under data complexity (whereas in other probabilistic data models, projection is intractable). Under query-and-data complexity, the problem becomes in...

Value Joins are Expensive over (Probabilistic) XML. Extended Version

We address the cost of adding value joins to tree-pattern queries and monadic second-order queries over trees in terms of the tractability of query evaluation over two data models: XML and probabilistic XML. Our results show that the data complexity rises from linear, for join-free queries, to intractable, for queries with value joins, while combined complexity remains essentially the same. For ...

KRDB Research Centre Technical Report: Value Joins are Expensive over (Probabilistic) XML. Extended

We address the cost of adding value joins to tree-pattern queries and monadic second-order queries over trees in terms of the tractability of query evaluation over two data models: XML and probabilistic XML. Our results show that the data complexity rises from linear, for join-free queries, to intractable, for queries with value joins, while combined complexity remains essentially the same. For ...

A Probabilistic Approach to XML Data Management

Uncertainty is ubiquitous in data and can take various forms. Usually, this is not formally taken into account: only the most likely data interpretation is kept for future processing, or all probable choices of correct information above a threshold are maintained. We claim this is not sufficient. There is a need for managing the imprecision in data more rigorously, and the current thesis addres...


Publication date: 2013